Frequentist approach


Recursive PAC-Bayes: A Frequentist Approach to Sequential Prior Updates with No Information Loss

Neural Information Processing Systems

PAC-Bayesian analysis is a frequentist framework for incorporating prior knowledge into learning. It was inspired by Bayesian learning, which allows sequential data processing and naturally turns posteriors from one processing step into priors for the next. However, despite two and a half decades of research, the ability to update priors sequentially without losing confidence information along the way remained elusive for PAC-Bayes. While PAC-Bayes allows construction of data-informed priors, the final confidence intervals depend only on the number of points that were not used for the construction of the prior, whereas confidence information in the prior, which is related to the number of points used to construct the prior, is lost. This limits the possibility and benefit of sequential prior updates, because the final bounds depend only on the size of the final batch. We present a novel and, in retrospect, surprisingly simple and powerful PAC-Bayesian procedure that allows sequential prior updates with no information loss.
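
For context, a standard (non-recursive) PAC-Bayes-kl bound, not the paper's recursive procedure, makes the information-loss point concrete. With probability at least $1-\delta$ over an i.i.d. sample $S$ of size $n$, simultaneously for all posteriors $\rho$,

$$\operatorname{kl}\!\left(\mathbb{E}_{\rho}[\hat L(h,S)] \,\middle\|\, \mathbb{E}_{\rho}[L(h)]\right) \;\le\; \frac{\operatorname{KL}(\rho\,\|\,\pi) + \ln\frac{2\sqrt{n}}{\delta}}{n},$$

where $\pi$ is the prior, $\hat L$ the empirical loss, and $L$ the expected loss. If $\pi$ is built from a separate batch of $m$ points, only the remaining $n$ points appear in the denominator, so the confidence contributed by those $m$ points is lost from the final interval.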


A Bayesian Bradley-Terry model to compare multiple ML algorithms on multiple data sets

Wainer, Jacques

arXiv.org Artificial Intelligence

This paper proposes a Bayesian model to compare multiple algorithms on multiple data sets, on any metric. The model is based on the Bradley-Terry model, which counts the number of times one algorithm performs better than another on different data sets. Because of its Bayesian foundations, the Bayesian Bradley-Terry model (BBT) has different characteristics than frequentist approaches to comparing multiple algorithms on multiple data sets, such as Demsar (2006) tests on mean rank and Benavoli et al. (2016) multiple pairwise Wilcoxon tests with p-adjustment procedures. In particular, a Bayesian approach allows for more nuanced statements regarding the algorithms beyond claiming that the difference is or is not statistically significant. Bayesian approaches also allow one to define when two algorithms are equivalent for practical purposes, i.e., a region of practical equivalence (ROPE). Unlike the Bayesian signed-rank comparison procedure proposed by Benavoli et al. (2017), our approach can define a ROPE for any metric, since it is based on probability statements and not on differences of that metric. This paper also proposes a local ROPE concept, which evaluates whether a positive difference between one algorithm's mean measure across some cross-validation and that of another algorithm should really be seen as the first algorithm being better than the second, based on effect sizes. The local ROPE proposal is independent of the Bayesian framing and can also be used in frequentist approaches based on ranks. An R package and a Python program that implement the BBT are available.
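
To make the win-counting structure concrete, here is a minimal Python sketch: it counts pairwise wins across data sets and fits plain Bradley-Terry strengths with a simple maximum-likelihood (MM) update. It is illustrative only; the function names and toy data are invented here, and it does not implement the paper's Bayesian BBT inference or its ROPE machinery.

import numpy as np

def pairwise_wins(scores):
    # scores: (n_datasets, n_algorithms) matrix of a metric where higher is better.
    n_alg = scores.shape[1]
    wins = np.zeros((n_alg, n_alg))
    for row in scores:
        for i in range(n_alg):
            for j in range(n_alg):
                if row[i] > row[j]:
                    wins[i, j] += 1
    return wins

def fit_bradley_terry(wins, n_iter=200):
    # MM (minorization-maximization) fit of Bradley-Terry strengths theta,
    # where P(i beats j) = theta_i / (theta_i + theta_j).
    n = wins.shape[0]
    theta = np.ones(n)
    for _ in range(n_iter):
        for i in range(n):
            num = wins[i].sum() + 0.01  # tiny pseudo-count keeps strengths positive
            den = sum((wins[i, j] + wins[j, i]) / (theta[i] + theta[j])
                      for j in range(n) if j != i)
            theta[i] = num / den
        theta /= theta.sum()
    return theta

# Example: 3 algorithms evaluated on 10 data sets (accuracy-like metric).
rng = np.random.default_rng(0)
scores = rng.normal(loc=[0.80, 0.78, 0.75], scale=0.02, size=(10, 3))
wins = pairwise_wins(scores)
theta = fit_bradley_terry(wins)
print("win counts:\n", wins)
print("Bradley-Terry strengths:", np.round(theta, 3))
print("P(alg0 beats alg1):", round(theta[0] / (theta[0] + theta[1]), 3))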


How Bayesian Neural Networks behave part1(Machine Learning)

#artificialintelligence

Abstract: We have constructed a Bayesian neural network capable of retrieving tropospheric temperature profiles from rotational Raman-scatter measurements of nitrogen and oxygen, and applied it to measurements taken by the RAman Lidar for Meteorological Observations (RALMO) in Payerne, Switzerland. We give a detailed description of using a Bayesian method to retrieve temperature profiles, including estimates of the uncertainty due to the network weights and the statistical uncertainty of the measurements. We trained our model using lidar measurements under different atmospheric conditions, and we tested our model using measurements not used for training the network. The computed temperature profiles extend over the altitude range of 0.7 km to 6 km. The mean bias estimate of our temperatures relative to the MeteoSwiss standard processing algorithm does not exceed 0.05 K at altitudes below 4.5 km, and does not exceed 0.08 K in the altitude range of 4.5 km to 6 km.


Frequentist vs. Bayesian Statistics with Tensorflow

#artificialintelligence

This article belongs to the series "Probabilistic Deep Learning". This weekly series covers probabilistic approaches to deep learning. The main goal is to extend deep learning models to quantify uncertainty, i.e., to know what they do not know. The frequentist approach to statistics is based on the idea of repeated sampling and long-run relative frequency. It involves constructing hypotheses about a population and testing them using sample data.
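
As a minimal illustration of the repeated-sampling idea (a NumPy sketch written for this summary, not the article's TensorFlow code), a permutation test approximates the long-run distribution of a statistic under the null hypothesis and reads the p-value off that distribution:

import numpy as np

rng = np.random.default_rng(1)

# Two observed samples (e.g. a control and a treatment group).
control = rng.normal(10.0, 2.0, size=200)
treatment = rng.normal(10.4, 2.0, size=200)
observed_diff = treatment.mean() - control.mean()

# Frequentist permutation test: under the null hypothesis the group labels are
# exchangeable, so the sampling distribution of the difference is approximated
# by repeatedly reshuffling labels and recomputing the statistic.
pooled = np.concatenate([control, treatment])
n_perm = 10_000
diffs = np.empty(n_perm)
for k in range(n_perm):
    perm = rng.permutation(pooled)
    diffs[k] = perm[len(control):].mean() - perm[:len(control)].mean()

p_value = np.mean(np.abs(diffs) >= abs(observed_diff))
print(f"observed difference: {observed_diff:.3f}, two-sided p-value: {p_value:.4f}")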


Frequentist vs Bayesian Statistics: Which One Is Best! - Experfy Insights

#artificialintelligence

While performing statistical analysis, we often face the dilemma of choosing between a frequentist and a Bayesian strategy for the problem. This choice becomes critical when working with limited-sized datasets. And if you use one method over the other without a fundamental understanding of the assumptions and limitations of the two approaches, you increase your chance of making a wrong inference. The philosophical divide between frequentist and Bayesian statistics goes back 250 years. The Bayesian approach dominated 19th-century statistics, while the frequentist approach gained popularity in the 20th century.


The Bayesian vs frequentist approaches: implications for machine learning – Part two

#artificialintelligence

Sampled from a distribution: many machine learning algorithms assume that the data is sampled from a particular distribution. For example, linear regression assumes a Gaussian distribution, and logistic regression assumes that the data is sampled from a Bernoulli distribution.
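
A small sketch of what those assumptions mean in practice (illustrative, not taken from the article): the per-example losses minimized by linear and logistic regression are exactly the negative log-likelihoods of the Gaussian and Bernoulli distributions.

import numpy as np

def gaussian_nll(y, y_pred, sigma=1.0):
    # Linear regression: y is assumed Gaussian around the linear prediction, so
    # maximizing the likelihood is minimizing squared error (up to constants).
    return 0.5 * np.log(2 * np.pi * sigma**2) + (y - y_pred)**2 / (2 * sigma**2)

def bernoulli_nll(y, p):
    # Logistic regression: y in {0, 1} is assumed Bernoulli with probability p
    # given by the sigmoid of the linear score; this is the familiar log-loss.
    eps = 1e-12
    p = np.clip(p, eps, 1 - eps)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

print(gaussian_nll(y=2.0, y_pred=1.5))   # squared-error term plus a constant
print(bernoulli_nll(y=1, p=0.8))         # log-loss for a confident correct prediction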


How Machine Learning Models Fail in the Real World

#artificialintelligence

This article was published as a part of the Data Science Blogathon. Yesterday, my brother broke an antique at home. I began to search for FeviQuick (a classic glue) to put it back together. Given that it's one of the most misplaced items, I began to search for it in every possible drawer and every untouched corner of the house I hadn't been to in the past 3 months. I gave up the search after an hour – the FeviQuick was nowhere to be found.


Estimating g-Leakage via Machine Learning

Romanelli, Marco, Chatzikokolakis, Konstantinos, Palamidessi, Catuscia, Piantanida, Pablo

arXiv.org Machine Learning

This paper considers the problem of estimating the information leakage of a system in the black-box scenario. It is assumed that the system's internals are unknown to the learner, or anyway too complicated to analyze, and the only available information is pairs of input-output data samples, possibly obtained by submitting queries to the system or provided by a third party. Previous research has mainly focused on counting frequencies to estimate the input-output conditional probabilities (referred to as the frequentist approach); however, this method is not accurate when the domain of possible outputs is large. To overcome this difficulty, the estimation of the Bayes error of the ideal classifier was recently investigated using Machine Learning (ML) models, and it has been shown to be more accurate thanks to the ability of those models to learn the input-output correspondence. However, the Bayes vulnerability is only suitable for describing one-try attacks. A more general and flexible measure of leakage is the g-vulnerability, which encompasses several different types of adversaries, with different goals and capabilities. In this paper, we propose a novel approach to perform black-box estimation of the g-vulnerability using ML. A feature of our approach is that it does not require estimating the conditional probabilities and is suitable for a large class of ML algorithms. First, we formally show learnability for all data distributions. Then, we evaluate the performance via various experiments using k-Nearest Neighbors and Neural Networks. Our results outperform the frequentist approach when the observables domain is large.
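
The paper's g-vulnerability estimator is not reproduced here, but the following toy sketch (invented for this summary, using scikit-learn) contrasts the two estimation styles the abstract mentions: estimating the adversary's success probability by frequency counts over discretized outputs versus letting a k-Nearest Neighbors classifier learn the output-to-input correspondence.

import numpy as np
from collections import Counter, defaultdict
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Toy black-box system: secret input x in {0,...,9}, noisy real-valued output y.
n = 2000
x = rng.integers(0, 10, size=n)
y = x + rng.normal(0.0, 1.5, size=n)

# Frequentist approach: discretize the output, count frequencies to estimate
# P(x | y), and guess the most frequent secret in each output bin.
edges = np.linspace(y.min(), y.max(), 50)
by_bin = defaultdict(list)
for b, xi in zip(np.digitize(y, edges), x):
    by_bin[b].append(xi)
best_guess = {b: Counter(xs).most_common(1)[0][0] for b, xs in by_bin.items()}

# ML approach: a k-NN classifier learns the output-to-input correspondence.
knn = KNeighborsClassifier(n_neighbors=15).fit(y.reshape(-1, 1), x)

# Compare estimated success probability (Bayes vulnerability) on fresh samples.
x_test = rng.integers(0, 10, size=1000)
y_test = x_test + rng.normal(0.0, 1.5, size=1000)
freq_pred = np.array([best_guess.get(b, 0) for b in np.digitize(y_test, edges)])
knn_pred = knn.predict(y_test.reshape(-1, 1))
print("frequentist vulnerability estimate:", np.mean(freq_pred == x_test))
print("k-NN vulnerability estimate:      ", np.mean(knn_pred == x_test))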


Bayesian Statistics: From Concept to Data Analysis Coursera

#artificialintelligence

About this course: This course introduces the Bayesian approach to statistics, starting with the concept of probability and moving to the analysis of data. We will learn about the philosophy of the Bayesian approach as well as how to implement it for common types of data. We will compare the Bayesian approach to the more commonly-taught Frequentist approach, and see some of the benefits of the Bayesian approach. In particular, the Bayesian approach allows for better accounting of uncertainty, results that have more intuitive and interpretable meaning, and more explicit statements of assumptions. This course combines lecture videos, computer demonstrations, readings, exercises, and discussion boards to create an active learning experience.


Is Bayesian A/B Testing Immune to Peeking? Not Exactly

#artificialintelligence

Since I joined Stack Exchange as a Data Scientist in June, one of my first projects has been reconsidering the A/B testing system used to evaluate new features and changes to the site. Our current approach relies on computing a p-value to measure our confidence in a new feature. Unfortunately, this leads to a common pitfall in A/B testing: the habit of looking at a test while it's running and stopping it as soon as the p-value reaches a particular threshold, say 0.05. This seems reasonable, but in doing so you make the p-value no longer trustworthy and make it substantially more likely you'll implement features that offer no improvement. How Not To Run an A/B Test gives a good explanation of this problem.
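
A small simulation (written for this summary, not taken from the post) makes the pitfall concrete: with two identical variants, checking a two-proportion z-test after every batch and stopping at p < 0.05 "detects" an effect far more often than the nominal 5%.

import math
import numpy as np

rng = np.random.default_rng(0)

def two_sided_p(successes_a, n_a, successes_b, n_b):
    # Two-proportion z-test with a pooled standard error.
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (successes_a / n_a - successes_b / n_b) / se
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))

def run_experiment(peek_every=100, horizon=5000, base_rate=0.10, peek=True):
    # Both variants share the same conversion rate, so any rejection is a false positive.
    conv_a = conv_b = 0
    for n in range(peek_every, horizon + 1, peek_every):
        conv_a += rng.binomial(peek_every, base_rate)
        conv_b += rng.binomial(peek_every, base_rate)
        if peek and two_sided_p(conv_a, n, conv_b, n) < 0.05:
            return True  # stopped early and "found" an effect that is not there
    return two_sided_p(conv_a, horizon, conv_b, horizon) < 0.05

trials = 2000
peeking_fp = np.mean([run_experiment(peek=True) for _ in range(trials)])
fixed_fp = np.mean([run_experiment(peek=False) for _ in range(trials)])
print(f"false positive rate with peeking:  {peeking_fp:.3f}")
print(f"false positive rate, fixed horizon: {fixed_fp:.3f}")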